REPRESENT FIRST MOLECULE AS A SET 
OF ATOMS WITH ASSOCIATED SCALAR 
DESCRIPTORS DERIVED FROM INTERATOMIC 
DISTANCES IN SAID FIRST MOLECULE 
REPRESENT SECOND MOLECULE AS A SET 
OF ATOMS WITH ASSOCIATED SCALAR 
DESCRIPTORS DERIVED FROM INTERATOMIC 
DISTANCES IN SAID SECOND MOLECULE 
ASSESS MOLECULAR SIMILARITY 
BY COMPARING THE MOLECULAR 
REPRESENTATIONS 



FIG. 1 
1D-DERIVATION MODULE 
STORAGE AND RETRIEVAL MODULE 



DATABASE OF LINEAR REPRESENTATIONS 
COMPARISON MODULE 

FIG. 2 

FIG. 4 

Bin-Based Overlap 

Do a series of fast overlap calculations using "bins" with integer occupation numbers (0 to 255) for each atom: 

- Multiply occupation numbers for matching atom types across aligned bins to get a good estimate of overlap area 
- Fast, but there are numerous bin-based offsets that must be considered 
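The calculation described above can be sketched as follows. This is a minimal illustration, not the implementation behind the figure: the data layout (a dictionary mapping each atom type to a dictionary of bin index to occupation number), the function names `bin_overlap` and `best_offset`, and the sign convention for the offset are all assumptions made for the example. Occupation numbers are summed per atom type, which gives the same total as multiplying atom by atom because the product is bilinear.

```python
def bin_overlap(mol_a, mol_b, offset):
    """Overlap of two binned molecules at one bin offset.

    mol_a, mol_b: dict atom_type -> dict bin_index -> occupation (0..255).
    offset: bin i of mol_a is aligned with bin i + offset of mol_b
            (an assumed convention).
    """
    total = 0
    for atom_type, bins_a in mol_a.items():
        bins_b = mol_b.get(atom_type)
        if not bins_b:
            continue  # only matching atom types contribute
        for i, occ_a in bins_a.items():
            total += occ_a * bins_b.get(i + offset, 0)
    return total


def best_offset(mol_a, mol_b, offsets):
    """Brute-force scan: return (largest overlap, offset that produced it)."""
    return max((bin_overlap(mol_a, mol_b, k), k) for k in offsets)


# Hypothetical toy data: two carbons in A, two carbons in B shifted by 2 bins.
mol_a = {"C": {0: 255, 1: 100}, "N": {3: 200}}
mol_b = {"C": {2: 255, 3: 100}, "O": {5: 50}}
result = best_offset(mol_a, mol_b, range(-3, 4))
```

Each offset costs one pass over the occupied bins, so the scan is cheap per offset, but its cost grows with the number of offsets that have to be tried.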



Speeding Up Bin-Based Overlap Calculations 
- 21 unique bin offsets, 10 matching atom type pairs 
- There are only 6 different bin offsets wherein matching atom types are approximately aligned: 



offset = -5 dbin 
offset = 0 
overlap * 3 
etc. 
Approximate Bin-Based Overlaps -> Upper Bounds 



Colors indicate which atom types contribute each unit of approx. overlap 

Combine totals from nearby offsets to get strict upper bounds on bin-based overlaps 

Bin Offset 
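One way to read the figure above is sketched here. It assumes that each matching atom type pair contributes one unit of approximate overlap at the offset that aligns the two atoms' center bins, and that totals within a fixed window of offsets are summed and scaled by the largest overlap a single pair can produce. The names `approx_overlap_counts` and `upper_bounds`, and the parameters `radius` and `max_pair_overlap`, are hypothetical. The bound is strict only if no atom pair can overlap at an offset more than `radius` bins away from its aligned offset.

```python
from collections import Counter


def approx_overlap_counts(atoms_a, atoms_b):
    """Count matching atom type pairs per aligning offset.

    atoms_a, atoms_b: lists of (atom_type, center_bin) tuples.
    Returns a Counter mapping offset -> number of aligned matching pairs.
    """
    counts = Counter()
    for type_a, bin_a in atoms_a:
        for type_b, bin_b in atoms_b:
            if type_a == type_b:
                counts[bin_b - bin_a] += 1
    return counts


def upper_bounds(counts, offsets, radius, max_pair_overlap):
    """Combine totals from nearby offsets into a bound for each offset.

    radius: assumed largest distance (in bins) at which a pair still overlaps.
    max_pair_overlap: assumed largest overlap one atom pair can contribute.
    """
    return {
        k: max_pair_overlap
        * sum(counts.get(k + d, 0) for d in range(-radius, radius + 1))
        for k in offsets
    }


# Hypothetical toy data.
atoms_a = [("C", 0), ("C", 3), ("N", 5)]
atoms_b = [("C", 2), ("N", 7), ("O", 4)]
counts = approx_overlap_counts(atoms_a, atoms_b)
bounds = upper_bounds(counts, range(-2, 4), radius=1, max_pair_overlap=10)
```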
- Process offsets in order of decreasing upper bound 

- Do standard bin-based overlap calculations (with occupation numbers), keeping track of the largest overlap value 

- Stop when the remaining upper bounds are lower than this largest bin-based overlap 
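The three steps above can be sketched as a simple pruned search. The function name `pruned_best_overlap` and its arguments are assumptions for illustration: `bounds` maps each offset to an upper bound that is taken to be valid, and `exact_overlap` is any callable that performs the standard bin-based calculation for one offset.

```python
def pruned_best_overlap(bounds, exact_overlap):
    """Find the largest bin-based overlap while skipping hopeless offsets.

    bounds: dict offset -> upper bound on the overlap at that offset
            (assumed valid, i.e. never below the exact overlap).
    exact_overlap: callable offset -> exact bin-based overlap.
    Returns (best overlap, best offset, number of exact calculations done).
    """
    best_value, best_offset, evaluated = -1, None, 0  # overlaps are >= 0
    ordered = sorted(bounds.items(), key=lambda kv: kv[1], reverse=True)
    for offset, bound in ordered:
        if bound < best_value:
            break  # every remaining bound is lower still, so stop
        value = exact_overlap(offset)
        evaluated += 1
        if value > best_value:
            best_value, best_offset = value, offset
    return best_value, best_offset, evaluated


# Hypothetical exact overlaps and bounds for four offsets.
exact = {-1: 5, 0: 12, 1: 30, 2: 18}
bounds = {-1: 10, 0: 25, 1: 40, 2: 35}
result = pruned_best_overlap(bounds, exact.get)
```

In this toy case only two of the four offsets need an exact calculation, because the bounds for the other two fall below the overlap of 30 already found.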